Multiscale model of electronic behavior and localization in stretched dry DNA

When the DNA double helix is subjected to external forces it can stretch elastically to elongations
reaching 100% of its natural length. These distortions, imposed at the mesoscopic or macroscopic 
scales, have a dramatic effect on electronic properties at the atomic scale and on electrical transport 
along DNA. Accordingly, a multiscale approach is necessary to capture the electronic behavior of the 
stretched DNA helix. To construct such a model, we begin with accurate density-functional-theory
calculations for electronic states in DNA bases and base pairs in various relative configurations 
encountered in the equilibrium and stretched forms. These results are complemented by semi- 
empirical quantum mechanical calculations for the states of a small size [18 base pair poly(CG)- 
poly(CG)] dry, neutral DNA sequence, using previously published models for stretched DNA. The 
calculated electronic states are then used to parametrize an effective tight-binding model that can 
describe electron hopping in the presence of environmental effects, such as the presence of stray water 
molecules on the backbone or structural features of the substrate. These effects introduce disorder 
in the model hamiltonian which leads to electron localization. The localization length is smaller by 
several orders of magnitude in stretched DNA relative to that in the unstretched structure. 



I. INTRODUCTION 



Soon after Watson and Crick's discovery of the DNA
double-helix structure [1], Eley and Spivey [2] introduced
the notion of efficient charge transport along the stacked
π orbitals of the bases. The mechanism of charge transport
has been the subject of numerous studies in the
intervening years, with renewed interest fuelled recently 
by both biological and technological considerations. Over 
a decade ago, Barton and co-workers observed distance- 
independent charge transfer between DNA-intercalated 
transition-metal complexes [3] and argued that it would 
be relevant for biology and biotechnology. More recent 
electron transport experiments on DNA have yielded 
widely varying results, showing alternatively insulating 
behavior [4], [5], [6], semiconducting behavior [7], [8], [9],

Ohmic conductivity [10], [11], [12], [13], and proximity-induced
superconductivity [14]. The large number of relevant
variables endemic to such experiments, like the
DNA-electrode contact, and the rich variety of structures
that DNA can assume, are the causes of variability in the
experimental measurements (for a recent review of transport
theory and experiments see Ref. [15]).

Specifically, there is a large diversity of DNA forms
in terms of composition, length, and structure. Early
experiments suggested that DNA stretched substantially
beyond its natural length (also referred to as
"overstretched DNA") can undergo a transition to an
elongated structure up to twice the length of relaxed DNA
[16]. This was also confirmed by recent single-molecule
stretching experiments [17], [18], [19], which showed that
the molecule can be reversibly stretched by up to 90% of its
natural length. Such important deformations of the double
helix may occur in biological environments. Stretching
of DNA is also related to cellular processes, such as
transcription and replication. For example, proteins of- 
ten induce important local distortions in the double helix 
while they diffuse along the molecule in search of their 
target sequences. The electronic and transport proper- 
ties of DNA are directly influenced by its different con- 
formations as well as by environmental factors, such as 
counter ions, impurities or temperature. A full account of 
these effects based on a realistic, atomic scale description 
of the structure and the electronic properties challenges 
the capabilities of theoretical models. 

Theoretical efforts to understand the electronic behav- 
ior and transport in DNA can be divided into two general 
categories: 

(i) Model calculations that use effective hamiltonians and 
master equations to describe the dynamics of electrons 
and holes in DNA (see, for instance, Refs. [20], [21], [22],
[23]). Recent results [24] have led to considerable insights
concerning the sequence-independent delocalization
of electronic states in DNA. The main limitation
of such approaches lies in the difficulty of determining 
accurate values for the parameters in the effective hamil- 
tonians. 

(ii) Ab initio calculations that can provide an accurate 
and detailed description of the electronic features [25],
[26], [27]. These approaches are typically limited to a small
number of atoms due to computational costs, and cannot 
readily handle the full complexity of DNA molecules in 
various conformations. In particular, stretching of DNA 
can induce a very significant deviation from the B form 
which is stable under normal conditions in aqueous so- 
lution. Such structural distortions are bound to have a 
profound effect on the electronic behavior. A realistic
description of these effects makes it necessary to handle
both the atomic scale features and the overall state of 
the macromolecule. 



In the present work, we address the problem of DNA 
stretching effects on the electronic states and the electron 
localization by providing a bridge between the two
extremes of the length scale; a similar methodology was
recently used to study hole transfer in DNA [28].
Theoretically, there are different ways of pulling the opposite ends
of the DNA strands, leading to different stretched DNA 
forms, which are determined largely by base pair reorien- 
tations. Here, we use the poly(CG)-poly(CG) structures 
obtained in the pioneering study of Lebrun and Lavery 
[29] as the representative structure for stretching effects.
This study modeled the adiabatic elongation of selected 
DNA molecules in two modes of stretching, correspond- 
ing to pulling on opposite 3'-3' ends or 5'-5' ends of the 
molecule: In the 3'-3' stretching mode, the DNA helix is 
unwound leading to a ribbon-like structure, while in the 
5'-5' stretching mode the DNA helix contracts.



We begin with a set of detailed calculations for the 
electronic structure of DNA bases (A,T,C,G) and repre- 
sentative base pairs (AT-AT, CG-CG, AT-CG, CG-GC) 
in various relative configurations, as they are likely to 
appear in the stretched forms. These calculations are
based on density-functional theory [30], [31] and serve
to set the stage for more extensive calculations which 
employ successive levels of approximations necessary to 
handle the computational demands. Specifically, we ex- 
tract the salient features of electronic structure of the 
individual DNA bases and base pairs from the ab ini- 
tio calculations; these are compared to an efficient and 
realistic semi-empirical model [32], in order to establish 
the validity of the latter approach. At this intermediate 
scale, we consider an 18 base pair poly(CG)-poly(CG) 
DNA sequence which has been stretched by 30%, 60% 
and 90% relative to the natural length of the unstretched 
B form. The atomic structure of these forms has been 
established by Lebrun and Lavery [29], using empirical
interatomic potentials. We next use the information from 
this approximate description to build an effective hamil- 
tonian for the electronic behavior at much larger scales. 
This allows us to describe electron localization, due to 
the combined effects of stretching and environmental fac- 
tors, over mesoscopic to macroscopic length scales. The 
essence of the approach and the different scales involved 
are shown schematically in Fig. 1. We emphasize that we
address here issues related only to dry and neutral DNA 
structures, where the negatively charged groups on the 
backbone are passivated by protons, conditions that are 
relevant to the experiments we consider for comparison 
to our theoretical results; water molecules or counterions 
(such as Na+) are not considered in our calculations.



II. THEORETICAL METHODS 

A. Ab initio calculations 

As our first step toward establishing the electronic be- 
havior of dry, neutral DNA, we study the nature of elec- 
tronic states in individual bases and in base pairs. For 
these calculations we used three different implementations
of density-functional theory [30]: a method that
uses atomic-like orbitals as the basis (SIESTA) [33], one that
uses plane waves (VASP) [34] and a third that uses a
real-space grid (HARES) [35]. In all three approaches, we used
the same exchange-correlation functional in the
local-density approximation [31], for consistency and simplicity.
More elaborate approximations to exchange-correlation effects,
such as the generalized gradient approximation [36], do not
provide any improvement in describing the physics of 
these weakly interacting units. In each method we used 
pseudopotentials to represent the atomic cores, of the
Troullier-Martins type [37] in SIESTA, the Vanderbilt
ultrasoft type [38] in VASP and the Hamann-Schlüter-Chiang
type [39] in HARES, with computational parameters
(number of orbitals in basis, plane-wave kinetic
energy cutoff and grid spacing) that ensure a high
level of convergence. These calculations provide a thor- 
ough check on the consistency of various computational 
schemes to reproduce the electronic features of interest. 
The results are in excellent agreement across the three 
approaches. Since in these calculations there are no ad- 
justable parameters, we refer to them in the following as 
ab initio results. 



B. Construction of semi-empirical model 

The stretched forms contain a large number of atoms, 
typically beyond what can be efficiently treated with 
the ab initio methods used for the DNA bases and base 
pairs. Accordingly, for the electronic structure calcula- 
tions of these structures we use an efficient semi-empirical 
quantum-mechanical approach which employs a minimal 
basis set [32]. The consistency of this approach is then 
verified against the ab initio calculations. Within the
semi-empirical scheme, the electronic eigenfunctions are 
expressed as 

$$ |\psi^{(n)}\rangle = \sum_{\nu} c_{\nu}^{(n)} |\varphi_{\nu}\rangle \qquad (1) $$

where the basis set $|\varphi_{\nu}\rangle$ includes the s and p atomic
orbitals for each atom in the system. The coefficients $c_{\nu}^{(n)}$
are numerical constants, with $|c_{\nu}^{(n)}|^2$ giving the weight of
orbital $|\varphi_{\nu}\rangle$ in the electronic wavefunction. This method
uses a second order expansion in the electronic den- 
sity to obtain the total energy and takes into account 
self-consistently charge transfer effects which are impor- 
tant for biological systems. The method gives results 
for the band gaps that are in excellent agreement with
those of the ab initio approaches described above (see
Refs. [32], [40]).

The highest occupied and lowest unoccupied molecu- 
lar orbitals (HOMO and LUMO, respectively, also re- 
ferred to collectively as "frontier states" in the following) 
are extended over the entire structure in Bloch-like wave 
functions. In order to describe electron hopping and lo- 
calization, we need to express these in terms of a basis 
of Wannier-like states that are localized on the individ- 
ual bases. To this end, we construct maximally localized 
states on single base pairs by taking linear combinations 
of the HOMO and LUMO states from the wavefunctions 
of Eq. (1). The maximally localized states will then be
used to calculate the hopping parameters in the effective
1D hamiltonian. Using the extended electronic states
$|\psi^{(n)}\rangle$ of the frontier states, with corresponding
energies $\epsilon^{(n)}$, we define the maximally localized states
$|\tilde{\psi}^{(i)}\rangle$ through the unitary transformation

$$ |\tilde{\psi}^{(i)}\rangle = \sum_{n} \langle \psi^{(n)} | \tilde{\psi}^{(i)} \rangle \, |\psi^{(n)}\rangle \qquad (2) $$

which minimizes the sum of the variances

$$ \mathcal{L} = \sum_{i} \left( \langle \tilde{\psi}^{(i)} | z^2 | \tilde{\psi}^{(i)} \rangle - \langle \tilde{\psi}^{(i)} | z | \tilde{\psi}^{(i)} \rangle^2 \right) \qquad (3) $$

under the constraint $\langle \tilde{\psi}^{(i)} | \tilde{\psi}^{(j)} \rangle = \delta_{ij}$, where z is the
position along the helical axis. Similar and more general
methodologies have been developed in the past for
obtaining maximally localized states from extended ones
[41], [42]. Due to the invariance of the trace, the first term
in Eq. (3) is independent of the unitary transformation
and the problem is simplified to one of maximizing the
second term on the right-hand side with the same
orthonormality constraint. Carrying out the minimization,
we arrive at the equation

$$ \langle \tilde{\psi}^{(n)} | z | \tilde{\psi}^{(m)} \rangle \, (z_n - z_m) = 0 \qquad (4) $$

where

$$ z_n = \langle \tilde{\psi}^{(n)} | z | \tilde{\psi}^{(n)} \rangle. \qquad (5) $$

By inspection, we see that $\mathcal{L}$ is maximized when $z_n = z_m$
for all m and n, corresponding to maximally delocalized
states. On the other hand, $\mathcal{L}$ is minimized when the states
$|\tilde{\psi}^{(n)}\rangle$ are the eigenfunctions of the position operator z
within the HOMO or LUMO subspace. Therefore, the
problem is further reduced to constructing and
diagonalizing the matrix

$$ M_{nm} = \langle \psi^{(n)} | z | \psi^{(m)} \rangle \qquad (6) $$

which has the eigenvectors $\langle \psi^{(n)} | \tilde{\psi}^{(i)} \rangle$ that provide the
desired transformation given in Eq. (2). The eigenvalues
$z_i$ are the positions of the localized states. To evaluate
the matrix elements we use the approximation

$$ \langle \psi^{(n)} | z | \psi^{(m)} \rangle \approx \sum_{\mu\nu} c_{\mu}^{(n)*} c_{\nu}^{(m)} \, z_{\mu\nu} S_{\mu\nu} \qquad (7) $$

where $S_{\mu\nu} = \langle \varphi_{\mu} | \varphi_{\nu} \rangle$ is the overlap matrix between the
two atomic orbitals and $z_{\mu\nu} = (z_{\mu} + z_{\nu})/2$ is the average
z-value for the atoms located at sites given by the labels
$\mu$ and $\nu$. Once the localized states are constructed, the
hopping parameters can be computed as

$$ t_{ij} = \langle \tilde{\psi}^{(i)} | H | \tilde{\psi}^{(j)} \rangle = \sum_{n} \epsilon^{(n)} \langle \tilde{\psi}^{(i)} | \psi^{(n)} \rangle \langle \psi^{(n)} | \tilde{\psi}^{(j)} \rangle \qquad (8) $$

recalling that the quantities $\langle \psi^{(n)} | \tilde{\psi}^{(i)} \rangle$ are determined
from the transformation described above.
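
The localization procedure of Eqs. (2)-(8) can be sketched numerically. The Python/NumPy fragment below is only a minimal illustration under stated assumptions, not the implementation used for the results reported here: the function names `position_matrix` and `localize` are hypothetical, and the toy input (a six-site open chain with one orthonormal orbital per site, a spacing of 3.4 Å and a hopping of 10 meV, all chosen for illustration) merely stands in for the semi-empirical wavefunctions.

```python
import numpy as np

def position_matrix(C, S, z_atoms):
    """Approximate <psi_n| z |psi_m> following Eq. (7).

    C[mu, n]  : coefficient of atomic orbital mu in extended state n
    S[mu, nu] : overlap matrix of the atomic orbitals
    z_atoms   : z coordinate of the atom that carries each orbital
    """
    z_avg = 0.5 * (z_atoms[:, None] + z_atoms[None, :])  # z_{mu nu}
    return C.conj().T @ (z_avg * S) @ C

def localize(M, energies):
    """Diagonalize the position matrix of Eq. (6) and build the hoppings of Eq. (8)."""
    positions, U = np.linalg.eigh(M)         # U[n, i] = <psi_n | localized_i>
    t = U.conj().T @ np.diag(energies) @ U   # t[i, j] = <localized_i| H |localized_j>
    return positions, U, t

# Toy input (hypothetical numbers): open chain, nearest-neighbor hopping only.
n_sites, t0, spacing = 6, -10.0, 3.4         # t0 in meV, spacing in Angstrom
H = t0 * (np.eye(n_sites, k=1) + np.eye(n_sites, k=-1))
energies, C = np.linalg.eigh(H)              # extended states play the role of Eq. (1)
z_atoms = spacing * np.arange(n_sites)

M = position_matrix(C, np.eye(n_sites), z_atoms)
positions, U, t = localize(M, energies)
print(np.round(positions, 3))                # positions of the localized states
print(np.round(np.abs(t), 3))                # magnitudes of on-site and hopping terms
```

In this toy case the localized states sit on the individual sites and the magnitudes of the recovered hoppings coincide with those of the input chain; the signs of individual elements depend on the arbitrary phases of the eigenvectors.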

Having defined the maximally localized states in terms 
of the electronic wavefunctions from the all-atom calcula- 
tions, we next produce an effective tight-binding hamilto- 
nian, which allows us to study electron hopping along the 
DNA double helix. This approach has also been used in a 
recent study on functionalized carbon nanotubes [43]. In
our effective hamiltonian, we consider hopping between
first and second neighbors along the helix, and denote
the hopping matrix elements according to the scheme
shown in Fig. 2 for the HOMO state of the
poly(CG)-poly(CG) structure (all other frontier states involve
exactly the same type of hopping matrix elements):

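$$
\begin{aligned}
H = {} & \epsilon \sum_{n} c_n^{\dagger} c_n
      + t_1 \sum_{n\ \mathrm{even}} \left( c_n^{\dagger} c_{n+1} + c_{n+1}^{\dagger} c_n \right) \\
    & + t_2 \sum_{n\ \mathrm{odd}} \left( c_n^{\dagger} c_{n+1} + c_{n+1}^{\dagger} c_n \right)
      + t_3 \sum_{n} \left( c_n^{\dagger} c_{n+2} + c_{n+2}^{\dagger} c_n \right) \qquad (9)
\end{aligned}
$$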

where n represents the n-th base pair along the helical
axis and we have neglected spin indices because they are
unimportant for our analysis. Note that there is a
difference between hopping elements connecting even and
odd sites to their neighbors ($t_1$ and $t_2$ terms in the
effective hamiltonian of Eq. (9)), due to the asymmetry in
the structure illustrated in Fig. 2. Performing a Fourier
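transform on the electron creation and annihilation
operators

$$ c_k = \frac{1}{\sqrt{N}} \sum_{n} e^{-ikna} c_n \qquad (10) $$

where N is the number of base pairs and a the distance
between successive base pairs, gives a hamiltonian which
has coupling between momenta k and $k + \pi/a$. By doubling
the unit cell (and reducing the Brillouin Zone by a factor
of two), this can finally be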
diagonalized to obtain the eigenvalues 

$$ E_{\pm} = \epsilon + 2 t_3 \cos(2k) \pm \sqrt{t_1^2 + t_2^2 + 2 t_1 t_2 \cos(2k)} \qquad (11) $$

with the momentum sum carried out over the reduced 
Brillouin Zone. With these expressions for the band 
structure energies, the density of states (DOS) 

$$ g(\omega) = \frac{1}{N} \sum_{k,n} \delta\left( \omega - E_k^{(n)} \right) \qquad (12) $$

can be readily obtained. These quantities are essential 
in describing electron localization along the DNA double 
helix under different conditions. 
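
As a consistency check of Eqs. (9), (11) and (12), the analytic bands can be compared with a direct diagonalization of the hopping matrix on a periodic ring. The sketch below is illustrative only and rests on assumptions: the function names are hypothetical, the ring size (400 base pairs) and the number of histogram bins are arbitrary, the on-site energy is taken as the zero of energy, k is measured in units of the inverse base-pair spacing, and the hopping values are the HOMO entries of Table II in meV.

```python
import numpy as np

def bands(k, eps, t1, t2, t3):
    """Analytic band energies E_-(k) and E_+(k) of Eq. (11)."""
    center = eps + 2.0 * t3 * np.cos(2.0 * k)
    root = np.sqrt(t1**2 + t2**2 + 2.0 * t1 * t2 * np.cos(2.0 * k))
    return center - root, center + root

def ring_hamiltonian(n_sites, eps, t1, t2, t3):
    """Matrix form of Eq. (9) on a periodic ring (n_sites even and larger than 4)."""
    H = eps * np.eye(n_sites)
    for n in range(n_sites):
        t = t1 if n % 2 == 0 else t2          # even sites hop with t1, odd sites with t2
        H[n, (n + 1) % n_sites] += t
        H[(n + 1) % n_sites, n] += t
        H[n, (n + 2) % n_sites] += t3         # second-neighbor hop
        H[(n + 2) % n_sites, n] += t3
    return H

def density_of_states(energies, n_bins=60):
    """Histogram estimate of the DOS of Eq. (12), normalized to unit area."""
    hist, edges = np.histogram(energies, bins=n_bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist

eps, t1, t2, t3 = 0.0, 14.0, 2.60, 0.09      # meV; HOMO hoppings of Table II
n_sites = 400
k = 2.0 * np.pi * np.arange(n_sites // 2) / n_sites   # reduced Brillouin zone

lower, upper = bands(k, eps, t1, t2, t3)
analytic = np.sort(np.concatenate([lower, upper]))
numeric = np.linalg.eigvalsh(ring_hamiltonian(n_sites, eps, t1, t2, t3))
print(np.allclose(analytic, numeric))        # do Eq. (11) and the matrix agree?

omega, g = density_of_states(numeric)
```

The histogram is a crude stand-in for the delta functions in Eq. (12); a finer energy resolution requires a longer ring or a broadened delta function.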

C. Disorder and Localization length

In order to quantify the amount of localization that is
expected in stretched DNA forms, we add a term to the
hamiltonian in Eq. (9) of the form

$$ H_{dis} = \sum_{n} U_n c_n^{\dagger} c_n \qquad (13) $$

which is meant to emulate disorder arising from a variety
of sources such as interaction of the DNA bases with
stray water molecules and ions, or interaction with the
substrate. $U_n$ are uncorrelated random energy variations
chosen according to a Gaussian distribution of zero mean
and width $\gamma$

$$ P(U) = \frac{1}{\sqrt{2\pi}\,\gamma} \exp\left( -\frac{U^2}{2\gamma^2} \right) \qquad (14) $$



Once the disorder hamiltonian is constructed with a
specific set of random on-site energies, by direct
diagonalization we find the eigenstates $|\Psi^{(i)}\rangle$ of $H + H_{dis}$ (we use
capital symbols to denote the new wavefunctions from the
hamiltonian that includes the disorder term) and then
calculate the localization length defined as

$$ L^{(i)} = \left[ \langle \Psi^{(i)} | \hat{n}^2 | \Psi^{(i)} \rangle - \langle \Psi^{(i)} | \hat{n} | \Psi^{(i)} \rangle^2 \right]^{1/2} \qquad (15) $$

where

$$ \hat{n} = \sum_{n} n \, c_n^{\dagger} c_n. \qquad (16) $$



For a single-hopping model with weak disorder, the
localization length scales as $L \sim (t/\gamma)^2$ for electrons near
the middle of the band [44], with t the hopping matrix
element which determines the band width. The more 
complicated effective hamiltonian considered here is not 
amenable to simple analytic treatment. 
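
Since the effective hamiltonian does not admit a simple analytic treatment, the localization length of Eq. (15) has to be evaluated numerically. The following sketch shows one way this could be done; it is an illustration under assumptions rather than the code behind the results of this work. The function names, the chain length of 300 base pairs, the use of an open chain (so that the site index is a well-defined position) and the "reduced" hopping set are all hypothetical; the reference hoppings are the HOMO values of Table II and the disorder width of 0.3 meV is the value quoted in the Results section.

```python
import numpy as np

def chain_hamiltonian(n_sites, eps, t1, t2, t3):
    """Matrix form of Eq. (9) on an open chain."""
    H = eps * np.eye(n_sites)
    for n in range(n_sites - 1):
        H[n, n + 1] = H[n + 1, n] = t1 if n % 2 == 0 else t2
    for n in range(n_sites - 2):
        H[n, n + 2] = H[n + 2, n] = t3
    return H

def localization_lengths(H, gamma, rng):
    """Add Gaussian on-site disorder (Eqs. 13 and 14) and evaluate Eq. (15)."""
    n_sites = H.shape[0]
    U = rng.normal(0.0, gamma, size=n_sites)          # random on-site energies U_n
    energies, states = np.linalg.eigh(H + np.diag(U))
    weights = np.abs(states) ** 2                     # weights[n, i] = |<n|Psi_i>|^2
    n = np.arange(n_sites)
    mean_n = n @ weights                              # <Psi_i| n |Psi_i>
    mean_n2 = (n ** 2) @ weights                      # <Psi_i| n^2 |Psi_i>
    variance = np.maximum(mean_n2 - mean_n ** 2, 0.0) # guard against rounding
    return energies, np.sqrt(variance)

rng = np.random.default_rng(0)
gamma = 0.3                                           # meV
reference = chain_hamiltonian(300, 0.0, 14.0, 2.60, 0.09)     # Table II, HOMO
reduced = chain_hamiltonian(300, 0.0, 0.14, 0.026, 0.0009)    # hypothetical values

_, L_reference = localization_lengths(reference, gamma, rng)
_, L_reduced = localization_lengths(reduced, gamma, rng)
print(np.median(L_reference), np.median(L_reduced))
```

The lengths are expressed in units of the base-pair index; a single disorder realization is used here, whereas a quantitative estimate would average over many realizations.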



III. RESULTS AND DISCUSSION 

We begin our discussion with an overview of electronic 
states in single bases and isolated base pairs. The
structure of the base pairs is shown in Fig. 3 with the atoms
in each base labeled for future reference. These calcula- 
tions will set the stage for a proper interpretation of the 
behavior in the stretched and unstretched dry, neutral 
DNA helix. 



A. Frontier states 

The frontier states in the base pairs are related to only 
one component of the pair for both AT and CG. This is 
shown in Fig. 4. Specifically, the HOMO state of the AT
pair is the same as the HOMO state of
the isolated A, and the LUMO state of AT the same as



that of the isolated T. Similarly, the HOMO state of CG 
is identified with that of the isolated G and the LUMO 
state with that of the isolated C. Thus, the purines (A or 
G) give rise to the HOMO state, while the pyrimidines 
(T or C) are responsible for the LUMO states of each 
pair. It is clear from the same figure that essentially all
atomic $p_z$ orbitals which belong to a purine or pyrimidine
contribute to the respective HOMO or LUMO π state of
the base pair. This is in agreement with calculations on
the optical absorption spectra of DNA bases and base
pairs [45]. A closer inspection of Fig. 4 shows that the
molecular frontier states of both AT and CG can be iden- 
tified as similar contributions (up to sign changes) from 
specific groups of carbon and nitrogen atoms. Specif- 
ically, in the purines (A and G) three distinct groups 
of atoms are mainly involved in forming the HOMO or- 
bital and include atoms (C8-N7), (C2-N3) and (N1-C6- 
C5-C4-N9), respectively. In the pyrimidines (T and C) 
the main groups involved in forming the LUMO orbital 
are two, (C4-C5-N1) and (N3-N7-C6). In both base pairs 
the atoms that are less involved in the frontier molecu- 
lar states are the carbon atoms that form a double bond 
with an oxygen atom, such as C2 of A and C and the 
four-fold bonded C7 atom of A. 

The frontier states are very little affected when the 
two components of the base-pair are separated along the 
direction in which they are hydrogen-bonded. To
demonstrate this, we show in Fig. 5 the change in the
eigenvalues of the frontier states in AT and CG as a function of
the distance between the two atoms that are bonded to
the two backbones (we call this the backbone distance).
For both base pairs the nitrogen atoms labeled N1 and
N9 are the ones attached to the backbone (see Fig. 3).
In order to obtain realistic structures, for each value of 
the backbone distance we hold the atoms of each base 
that are bonded to the backbone fixed and allow all other 
atoms to relax fully. These calculations were performed 
with the SIESTA code [33] and the relaxed configurations 
were used as input to calculate the electronic structure 
with the other two methodologies [34], [35]. In Fig. 5 we
show complete results from the SIESTA calculations and 
selected results from one of the other two approaches. 

The results of Fig. 5 show clearly that interaction between
the two bases shifts the eigenvalues of the electronic
states appreciably only in the region where the backbone
distance becomes significantly smaller than the
equilibrium value, and even then the shifts are relatively
small for the frontier states. It is also noteworthy that
the band gap of the AT pair is significantly larger (~ 3 
eV) than that of the CG pair (~ 2 eV) and that the fron- 
tier states of CG lie within the band gap of the AT pair. 
This observation is important because it indicates that 
in an arbitrary sequence of base pairs, the frontier states 
will be associated with those of the CG pairs. This state- 
ment is verified by calculations of electronic states in the 
AT-AT, CG-CG and AT-CG base pair combinations, to 
which we turn next. 

For more detailed comparisons, we collect in Table I
the eigenvalues of the frontier states for the DNA pairs
and the pair combinations, at different equilibrium
configurations in the three relevant variables, the backbone
distance, the axial distance and the rotation angle. Some
results on the CG-GC base pair combination are also
shown, to allow for comparison to the poly(C)-poly(G)
sequence.

TABLE I: Eigenvalues (in eV) of the frontier states for the
DNA base pairs and the base-pair combinations, at the
equilibrium configurations for the backbone distance, the axial
distance (at zero relative angle of rotation) and the angle of
rotation (at the equilibrium axial distance). The column
labeled "min" gives the values of the distances and the angle at
the equilibrium configurations. Due to symmetry the values
for the minima at rotation angles larger than 180° are similar
to those given here and are not shown.

                      min       HOMO     LUMO     gap
  Backbone distance
    AT                8.67 Å    -1.63    1.60     3.23
    CG                8.73 Å    -0.80    1.31     2.11
  Axial distance
    AT-AT             3.67 Å    -1.33    1.37     2.70
    CG-CG             3.52 Å    -0.46    0.95     1.41
    AT-CG             3.36 Å    -0.71    1.00     1.71
  Rotation angle
    AT-AT             36°       -1.48    1.58     3.06
                      108°      -1.45    1.68     3.13
                      180°      -1.55    1.63     3.18
    CG-CG             36°       -0.52    1.22     1.74
                      108°      -0.64    1.54     2.18
                      180°      -0.94    1.60     2.54
    CG-GC             36°       -0.86    1.51     2.37
                      108°      -0.66    1.43     2.09
                      180°      -0.60    1.12     1.72
    AT-CG             36°       -0.73    1.38     2.11
                      108°      -0.59    1.27     1.86
                      180°      -0.81    1.25     2.06

When two base pairs are stacked on top of each other,
there are two degrees of freedom for motion of one relative
to the other: a separation along the helical axis, which
we will call axial distance, and a relative rotation around
the helical axis. We take the helical axis to be that which
corresponds to stacking of successive base pairs in the B
form of the DNA double helix. According to the notation
of Fig. 3, the helical axis for both base-pairs is normal
to the line connecting atoms C4 and C6 and is closer
(about one third of their distance) to the purine atom
C6. For each configuration we fix the atoms that are
bonded to the backbone at a given relative position and
allow all other atoms to relax, as was done in the
calculations involving the backbone distance discussed above.
In Fig. 6 we show the behavior of electronic eigenvalues
as a function of the axial distance and the rotation
angle. As above, the eigenvalues show little dependence
on these two variables, except for rather small values of
the axial distance which correspond to unphysically small
separation between the two base pairs.

TABLE II: Parameters for the on-site ($\epsilon$) and hopping matrix
elements ($t_i$, i = 1, 2, 3), for the HOMO and LUMO states of
unstretched poly(CG)-poly(CG) DNA.

                      HOMO     LUMO
  $\epsilon$ (eV)      3.12    -0.09
  $t_1$ (meV)         14.0     -0.29
  $t_2$ (meV)          2.60     0.04
  $t_3$ (meV)          0.09     0.26

What is also remarkable in the above results is that
in the AT-CG combination, the frontier states are clearly
identified with those corresponding to the CG pair
exclusively, which has the smaller band gap (see Fig. 6).
Moreover, we note that the band gap of the
poly(C)-poly(G) sequence, as calculated by the semi-empirical
method based on a minimal atomic orbital basis [32], is
in excellent agreement with the value obtained from the
SIESTA calculation (2.0 eV and 2.1 eV, respectively).
The band gap is expected to be significantly smaller in
the case of wet DNA and in the presence of counterions,
as shown in Ref. [46] for a Z-DNA helix. The band gaps
obtained with all three ab initio methods are identical within
the accuracy of these methods. The nature of electronic
wavefunctions obtained by the different methods is also
in good qualitative agreement. Accordingly, in the rest
of this paper we focus our attention on electron
localization in the dry, neutral poly(CG)-poly(CG) sequence,
and employ the results of the semi-empirical electronic
structure method.



B. Hopping electrons 

In Fig. 7 we show the unstretched and the three
stretched forms of the poly(CG)-poly(CG) sequences at
30%, 60% and 90% elongation, along with the features of
the frontier states. For visualization purposes, we
represent the calculated wavefunction magnitude of the
frontier states by blue (HOMO) and red (LUMO) spheres,
centered at the sites where the atomic orbitals are
located. The radius of the sphere centered on a particular
atom is proportional to the magnitude of the dominant
coefficient $|c_{\nu}^{(n)}|^2$ at this site (see Eq. (1)), which
is essentially proportional to the local electronic density.
It is evident from this figure that the nature of the
orbitals themselves, represented by the radii of the colored
spheres, does not change much in the different stretched
DNA forms, but the overlap between orbitals at
neighboring bases is affected greatly by the amount of
stretching. For the poly(CG)-poly(CG) sequence shown, the
HOMO orbitals are always associated with the G sites
for all the stretching modes, while the LUMO orbitals
are related to the C sites. However, as the DNA becomes
more elongated, the orbitals overlap less and less and
become localized at high elongation. The
elongation to the overstretched form is achieved by changing
the dihedral angle configuration of the DNA backbone,
which leaves the local part of the orbitals essentially
intact. Note how the orbitals rotate and spread out as the
structure is being overstretched, following the rotation of
bases.

We now turn to a discussion of the results for the
hopping matrix elements of Eq. (9). Our discussion here
is relevant to what happens when the occupation of a
frontier state is changed from complete filling (for the
HOMO) or complete depletion (for the LUMO), that is,
the physics of small amounts of hole or electron doping.
In Table II we give the values for $\epsilon$, $t_1$, $t_2$, $t_3$ (see Fig. 2)
for the two frontier states of the unstretched
poly(CG)-poly(CG) DNA form. The hopping matrix elements for
the HOMO state involve only the G sites; those for the
LUMO state involve only the C sites. As a consistency
check, we have also calculated matrix elements for farther
neighbors and found those to be much smaller in
magnitude. We have calculated the values of $t_1$, $t_2$, $t_3$ by
repeating the same procedure as above for the stretched forms
of the poly(CG)-poly(CG) DNA sequence. We note that
if $t_2 = t_3 = 0$ electrons will not be able to migrate along
the DNA molecule even if $t_1$ is quite large, because at
least one of the other two hops is necessary for
migration (see Fig. 2). From this simple picture, it is evident
that the conductivity will be determined by which
matrix element dominates. Quantitatively, the "bottleneck"
hopping matrix element is given by

$$ t = \max\left( \min(|t_1|, |t_2|), |t_3| \right). \qquad (17) $$

In Fig. 8 we show the value of the "bottleneck"
hopping matrix element calculated as a function of
stretching. This indicates that hopping conductivity will
decrease dramatically, by several orders of magnitude, upon
stretching the molecule and that the hopping will
decrease more from stretching in the 3'-3' mode than in
the 5'-5' mode. This is due to the conformational changes
induced by the different stretching modes, described
earlier.
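
The "bottleneck" rule of Eq. (17) is simple enough to evaluate by hand, but a short sketch makes the logic explicit: a carrier can advance either by alternating $t_1$ and $t_2$ hops, in which case the weaker of the two limits the motion, or by second-neighbor $t_3$ hops alone. The function name `bottleneck_hopping` is hypothetical; the input numbers are the unstretched values of Table II in meV.

```python
def bottleneck_hopping(t1, t2, t3):
    """Eq. (17): the limiting hop is the better of the two available routes.

    Route 1 alternates t1 and t2 hops, so it is limited by the smaller one.
    Route 2 uses second-neighbor hops only and is limited by t3.
    """
    return max(min(abs(t1), abs(t2)), abs(t3))

# Unstretched poly(CG)-poly(CG), values from Table II (meV)
homo = bottleneck_hopping(14.0, 2.60, 0.09)
lumo = bottleneck_hopping(-0.29, 0.04, 0.26)
print(homo, lumo)  # 2.6 0.26
```

For the HOMO the nearest-neighbor route dominates and is limited by $t_2$, whereas for the LUMO the second-neighbor hop $t_3$ provides the better route.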



C. Localization length 

The significant drop in the hopping matrix elements
upon stretching, as described in the previous section,
is indicative of electron localization in the presence of a
small amount of disorder. To investigate this possibility in
detail, we focus on effects of stretching in the 3'-3'
mode. The evolution of the density of HOMO states
upon stretching is shown in Fig. 9; similar behavior is
observed for the LUMO states. The dramatic narrowing
of the DOS width (equivalent to reduced dispersion in a
band-structure picture) is strongly suggestive of electron
localization [47], in this case induced by stretching. The
localization length is controlled by the hopping elements
t, since $\epsilon$ is the same at each site.

For a more quantitative description, we show in Fig. 9
the localization length $L^{(i)}$ for each eigenstate for a 1500
base-pair DNA strand under different amounts of
stretching. The value of $L^{(i)}$ for each state is obtained from
Eq. (15), with disorder strength $\gamma$ = 0.3 meV, which
determines the width of the gaussian given in Eq. (14). This
disorder strength is much smaller than the band width
of the unstretched DNA, but becomes comparable to the
band width as the molecule is stretched. The magnitude
of such variations in on-site energies is consistent
with those produced by the dipole potential terms, for
instance, due to the presence of a stray water molecule
situated on the substrate roughly 15 Å away from the
DNA bases. We find that changing the value of $\gamma$ by an
order of magnitude (either smaller or larger) does not
affect the qualitative picture presented here. Note that the
localization length is not a strict function of the energy,
as it depends on the disorder near where a given state
happens to be localized. As the molecule is stretched,
the localization length dramatically decreases until, for
60% stretching, the eigenstates are completely localized
on single base pairs.

The charge localization length as a function of DNA 
stretching has been recently studied in the experiment 
of Heim et al. [48] . This study focuses on A-DNA which 
has an irregular sequence of base pairs, and can be com- 
pared to our theoretical results for poly(CG)-poly(CG) 
recalling that the frontier states even for a random se- 
quence are associated with those of the CG base-pairs. 
In the experiment, ropes of A-DNA on a substrate are 
overstretched by a receding meniscus technique. The 
DNA ropes in this experimental setup are slightly pos- 
itively charged, corresponding to a depletion of a few 
electrons per 1000 base pairs. We suggest that this situa- 
tion is approximated by the structures of dry and neutral 
DNA that we considered above. Electrons were injected 
into the DNA and the resulting localization length was 
measured by an electron force microscope. For the un- 
stretched DNA, the charge was found to delocalize across 
the entire molecule, extending over a length of several 
microns. On the other hand, the charge injected into 
the overstretched DNA is localized, extending over a few 
hundred nanometers only. This is qualitatively consistent 
with the picture that emerges from our theoretical analy- 
sis, and is even in reasonable quantitative agreement: the 
degree of localization in experiment, measured by the ra- 
tio of length scales going from unstretched to stretched 
DNA structures, is approximately two orders of magni- 
tude, while the same quantity in our calculations, going 
from unstretched to 60% stretched DNA, is ~10^3. 



IV. SUMMARY 

We have described and implemented a multiscale 
method to derive effective hamiltonian models that are 
able to capture the dynamics of conduction and valence 
electrons in stretched DNA, starting from ab initio, all- 
atom quantum mechanical calculations. The ab initio 
simulations revealed that the frontier states in the base 
pairs are related to only one component of the pair. The 
purines were found to be associated with the HOMO 
states while the pyrimidines with the LUMO states. In 
the AT-CG combination the frontier states are identified 
with those of the CG pair. For all combinations of bases 
and base pairs studied here, the nature of these states was 
not affected by separation of the bases or base pairs along 
different directions or rotation along the helical axis. 

Turning to the next length scale and the semi-empirical 
calculations, we have calculated the "bottleneck" matrix 
elements for electron hopping along the DNA molecule, 
as a function of stretching. These show a significant 
decrease with elongation of DNA, which is stronger for 
stretching in the 3'-3' mode than in the 5'-5' mode. We 
were able to show quantitatively that stretching of DNA 
dramatically narrows the DOS width of frontier states. 
A small amount of disorder produced by environmental 
factors will naturally lead to localization of the electrons 
along the DNA. Our estimate for the degree of localiza- 
tion, based on a reasonable (and quite small) amount of 
disorder in the on-site energies for the electron states, 
is in very good agreement with recent experimental ob- 
servations. This provides direct validation for the con- 
sistency and completeness of the multiscale method pre- 
sented here. 
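
The link between reduced hopping and stronger localization can be made explicit in a simplified picture. As a sketch only, assume a single-band chain with one uniform hopping element t (the effective model here involves several) and uncorrelated on-site disorder of variance σ^2 that is weak compared with t; the symbols t, σ and ξ are introduced for this illustration. The standard perturbative result for that textbook model, which is not the expression used in this work, gives the decay length ξ of the wavefunction amplitude in units of the lattice spacing:

```latex
\xi(E) \simeq \frac{2\,\bigl(4t^{2} - E^{2}\bigr)}{\sigma^{2}},
\qquad \sigma \ll |t|, \quad |E| < 2|t| .
```

At fixed disorder, ξ scales as t^2 near the band center, so in this picture a tenfold drop in the hopping element shortens ξ a hundredfold. Once t becomes comparable to σ the expansion no longer applies and the states collapse onto single sites.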

FIG. 1: Schematic illustration of the different scales included 
in the current multiscale model: The two pictures on the left 
are atomistic systems simulated with different computational 
approaches (ab initio density functional theory and semi-empirical 
electronic structure, respectively). The picture on 
the right represents a rope composed of DNA molecules, as in 
experiments [48], which is treated by an effective tight-binding 
hamiltonian constructed from the atomistic scale calculations. 



FIG. 2: Schematic depiction of electron hopping in poly(CG)- 
poly(CG) DNA for the HOMO state. The hopping matrix 
elements U are denoted by the indices (i) = (1), (2), (3). Elec- 
trons are localized on the G bases. For the LUMO state, the 
hopping is similar with electrons localized on the C bases. 




FIG. 3: The DNA base pairs AT (top) and CG (bottom), with 
the atoms labeled. The purines (A, G) are on the right, the 
pyrimidines (T, C) on the left. Atom labeling follows standard 
notation convention [4^]. All rotations were performed with 
respect to the helical axis denoted by the black circle (see 
text). 




FIG. 4: The frontier states in the base pairs and their identification 
with corresponding orbitals in the isolated bases. The 
middle figure in each panel shows the total charge density on 
the plane of the base pair, with higher values of the charge 
density in red and lower values in blue. The figure on the left 
shows the HOMO state and the figure on the right shows the 
LUMO state, where red and blue isosurfaces correspond to 
positive and negative values of the wavefunctions. The labels 
on the left denote the type of bases and base pairs. 

FIG. 5: Eigenvalues of states in the AT and CG base pairs 
as a function of backbone distance. In each case three states 
are included above and below the band gap. Lines are results 
from SIESTA calculations, points are results from HARES 
calculations (see text). The frontier orbitals in both pairs 
are related to one component of the pair as indicated by the 
labels. The equilibrium backbone distance is denoted by a 
vertical dashed line. 

FIG. 6: Eigenvalues of states in the AT-AT, CG-CG and AT-CG 
base pair combinations as a function of the distance along 
the helical axis (at zero angle of rotation) and the rotation angle 
around the helical axis (at the equilibrium axial distance). 
Lines are results from SIESTA calculations, points are results 
from VASP calculations (see text). In each case three states 
are included above and below the band gap. The values of the 
distance or the rotation angle that correspond to equilibrium 
configurations are indicated by vertical dashed lines (there are 
five almost equivalent local minima in rotation). As in Fig. 5, 
frontier orbitals are identified as the corresponding orbital 
of one base only. 




FIG. 7: The DNA structures for the unstretched (top) and the 
different amounts of stretching in the 3'-3' and the 5'-5' modes 
with features of the frontier orbitals described by the blue 
(HOMO) and red (LUMO) spheres (see text for details). For 
both modes the amount of stretching is (a) 30%, (b) 60%, and 
(c) 90% relative to the unstretched structure, which is the B- 
DNA form. The 3'-5' orientations of the poly(CG)-poly(CG) 
sequence are shown in the left panel at 90% stretching, where 
the structure is easier to visualize. 

FIG. 8: The frontier state "bottleneck" hopping matrix elements 
as given by Eq. (17) for the different types (3'-3' or 
5'-5') and amounts of stretching of poly(CG)-poly(CG) DNA. 
At each value of stretching, the dominant hopping process is 
indicated in parentheses. 

FIG. 9: (bottom) The density of electronic states for the 
HOMO state stretched in the 3'-3' mode. For comparison, 
the on-site energy parameter, ε, has been set to zero. (top) 
The localization length L_i, defined in Eq. (15), is computed 
for each eigenstate with disorder strength γ = 0.3 meV. 



